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ABSTRACT 



The objective of this research is to develop a nonlinear 
regulator for an adaptive control system using backpropagating 
neural networks (BNN's) in conjunction with a linear quadratic 
regulator (LQR) . The basic concepts of adaptive control and 
the structure of neural networks are discussed. These 
concepts are integrated and the nonlinear regulator is 
derived. Simulation is conducted on a representative 
nonlinear system with both the LQR and the nonlinear 
regulator. Training of the regulator and its performance 
under varying BNN parameter values are examined. 

The simulation results show that the nonlinear regulator 
with BNN's exhibits superior performance compared to the LQR 
when the nonlinearities are large. The optimization of 
regulator performance with regard to BNN parameter values is 
discussed. 

Further research is required in order to determine the 
general applicability of this regulator and to develop more 
specific guidelines for BNN parameters. 
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I . INTRODUCTION 



A. OBJECTIVE 

The objective of this thesis is to develop a nonlinear 
regulator for an adaptive control system using backpropagating 
neural networks (BNN's) in conjunction with a linear quadratic 
regulator (LQR) . Discrete time models are used to represent 
the systems for simulation and analysis. 

B. BASIC CONCEPTS OF ADAPTIVE CONTROL 

There are inherent nonlinearities in most control systems 
due to elements such as environmental changes, minor system 
component failures, random time -varying parameters, and hard 
limits caused by physical constraints. These nonlinearities 
may be handled by an optimal control design if it is 
sufficiently robust, but with large uncertainties the 
controller has to sacrifice performance in favor of robustness 
and this might be unsatisfactory. 

One method of handling these nonlinearities is to use an 
adaptive control system. An adaptive control system is one 
which continuously and automatically measures the dynamic 
characteristics of the plant, compares them with the desired 
attributes, and uses the difference to vary adjustable system 
parameters so that optimal performance can be maintained 
regardless of the nonlinearities encountered [Ref. 1: p. 792] . 
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1. Performance Indexes 



The basis of adaptive control rests on the premise that 
there is some performance of the system which is optimal. 
Optimal performance is defined by specifying a performance 
index to measure the closeness of the controlled system to its 
goal. There is an infinite number of possibilities in the 
choice of a performance index. The choice of a particular 
index depends on the system and the desired results. In most 
cases the choice of index involves a compromise of minimizing 
the system costs while maximizing the system performance. 

The major drawback of performance indexes is while they 
specify the cost of system operation they do not give 
information about the transient response of the system [Ref. 
1: p.793] . A system that operates optimally according to the 
performance index may have undesirable transient 
characteristics. The transient response of the system must be 
analyzed to validate the choice of weighting matrices used in 
the index. 

2. Adaptive Control Systems 

An adaptive control system may include the following three 
functions : 

• Classification of the dynamic characteristics of the 

plant . 

• Decision making based on the classification of the plant. 

• Modification based upon the decisions made. 
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If the plant parameters are not exactly correct, then the 
initial classification, decision, and modification procedures 
will be insufficient to optimize the performance index. It is 
then necessary to continuously carry out these procedures 
throughout the period of operation. This constant redesign of 
the system identifies an adaptive control system [Ref. 1: 
p.793-794] . 

C. NEURAL NETWORKS IN ADAPTIVE CONTROL 

While control theory for linear time- invariant (LTI) 
systems is a relatively mature field, nonlinear control 
systems generally must be designed on a system- to- system 
basis. Neural networks, with their inherent adaptability, 
have the potential for wide application in nonlinear control 
systems. The ability of a neural network to be trained 
suggests that a neural network may be able to successfully 
control nonlinear systems which have poor performance when 
regulated by linear time -invariant controllers. This thesis 
will investigate the use of neural networks in conjunction 
with LTI control methods to construct a nonlinear regulator 
[Ref. 2: p.410]. This regulator will be implemented on a 
system that may be modeled as LTI with an added nonlinearity 
which either makes control by LTI methods inefficient or 
unstable . 

Two neural networks shall be used in the formulation of 
the regulator: one in parallel with the estimated plant 
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parameters to generate a better state vector estimate and one 
in parallel with the linear control to produce an improved 
control input to the system. Chapter II discusses a general 
neural network structure and the derivation of the nonlinear 
regulator. The performance of the regulator on a particular 
system is presented in Chapter III and conclusions are given 
in Chapter IV. 
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II. BNN NONLINEAR ADAPTIVE CONTROL 



A. BASIC CONCEPTS OF NEURAL NETWORKS 

Artificial neural networks are made up of elements which 
operate in a manner analogous to the most simple functions of 
biological neurons. These elements are linked in a fashion 
which may be similar to the connections within the human 
brain. Whether or not artificial neural networks actually are 
representative of the construction of the brain, they do show 
characteristics which are reminiscent of the human brain. An 
artificial neural network may be trained, it can recognize 
patterns, and it may apply its learning from past lessons to 
new data. These traits are quite limited, however they lend 
themselves to a wide field of applications. 

1. The Artificial Neuron 

The artificial neuron is meant to mimic the first- 
order characteristics of the biological neuron. A set of 
inputs is applied, each input representing the output of 
another neuron. Each input is multiplied by a corresponding 
linkweight and all of the weighted inputs are summed. This 
sum is the activation level of the neuron. Figure 1 shows a 
detailed representation of a single neuron. 

In Figure 1 the inputs are labeled x,, x 2 ,...,x n . 

These inputs are defined collectively as the (n x 1) column 
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vector x Ml . 



Each input is 



multiplied by its 
corresponding linkweight a,, 
a 2 ,...,a n (row vector a, size 
(1 x N ) ) and applied to the 
summation node [Ref. 3: 
pp. 12-14]. The summation 
node produces a scalar output 
z" which may be stated in vet 




Figure 1. A Neuron 
:or notation: 



z/=aj t** 1 . (2-1) 

The output of the summation node, zf* , is further 
processed by an activation function f, to produce the neuron's 
output signal x 



xf = f(z?) 



( 2 - 2 ) 



There are numerous activation functions such as a 
simple threshold, a hard-limited linear function, the 
hyperbolic tangent, and the sigmoid. These functions are 
shown in Figure 2. The main purpose of the activation 
function is to serve as a nonlinear gain so that each neuron 
maps a wide range of inputs into a bounded output. 

2 . Single Layer Neural Networks 

By themselves, neurons have limited capabilities, 
unless they are grouped together in layered networks. Figure 
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Figure 2. Various Activation Functions 

3 shows a single layer neural network with its inputs on the 
far left of the figure and the outputs on the right. The 
single layer neural network only performs the vector 
multiplication; there is no activation function. 

The circles on the left of Figure 3 serve only as 
distribution points for the inputs; they do not perform any 
calculations so they are ' not neurons. Each element of the 
input vector x is applied to each neuron. The linkweights may 
be considered as a matrix a with m rows and n columns where m 
is the number of outputs and n is the number of inputs . The 
number n is one greater than the dimension of the input vector 
x due to the bias term. Using equation (2-1) , z is the (m x 
1) output vector: 
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(2-3) 



The non- zero bias term is added as an input to each neuron. 
This bias input is also multiplied by its own linkweight. 

3 . Multiple Layer Neural Networks 

Neural networks with multiple layers offer greater 
computational capabilities than single layer networks. These 
networks are formed by cascading a number of single layers 
with the outputs of one layer providing the inputs to the next 
layer. These middle layers are known as hidden layers. 
Figure 4 shows a multiple layer neural network. The key to 
providing the extra power of the multiple layer networks is 
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Layer 2 




Figure 4. Multiple Layered Neural Network 

that the activation function is included in the hidden layers 
of the network. Otherwise, the multiple layer network could 
be modeled by a single layer network which had a linkweight 
matrix equal to the product of the individual linkweight 
matrices [Ref. 3: p.19] . The output layer usually has the 
function f (x) = x, waiving the limits of the activation 
function. 

4. Training of Neural Networks 

A network is trained so that a set of inputs will 
produce a desired set of outputs. Training is accomplished by 
applying an input vector, computing the output vector, 
comparing it to the desired vector, and modifying the 
linkweight-s by a predetermined algorithm. As the network is 
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trained, the linkweights will converge to values which will 
produce the desired output vector. 

There are several methods of training a neural 
network, one of which is the backpropagation routine. This is 
the algorithm used to train the neural networks used in this 
thesis. The next section discusses this technique. 

5. Backpropagating Neural Networks 

The backpropagating neural network is structured as 
shown in Figure 4. The signal flows from input to output. We 
assume no connections between the neurons of a layer and no 
feedback from any layer to the previous layers. 
Backpropagation refers to the method used to adjust the 
linkweights throughout the neural network. 

The output of a neuron in the last layer (xf*) is used 
with a desired output {x, p ) to produce an error signal (e") . 
This error signal is multiplied by the first derivative of the 
activation function for that neuron. Mathematically, 

*$ l • /'<*,*> 

- £'{Xj) (e“) . 



Then 6 is multiplied by the output from the source neuron for 
the linkweight which is to be updated and in turn is 
multiplied by a scaling factor the learning rate. This 
results in 
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(2-5) 



and 



where 



A "" 1 



- 6 ?: 



ij 



x 



M-l 



j ": 1 

*ij. t+1 



= a ": 1 



, £ -H A 



M-l 

ij 



( 2 - 6 ) 



• is the linkweight from neuron i in the layer M-l to 
neuron j in the output layer AT, 

• by 4 ' 1 is the value of 6 for the linkweight a^*' 1 , 

• t+1 denotes the updated linkweight. 

The linkweights in the hidden layers cannot be trained 
by this process, since there is no available target vector. 
Backpropagation trains the hidden layer linkweights by 
propagating the output error back through the network, 
adjusting the linkweights for each layer. Equations (2-4) , 
(2-5) , and (2-6) are still used for the hidden layers, but the 
target vector must be generated differently. The 6 is 
calculated for each neuron in the output layer, as in Equation 
(2-4) . The linkweights feeding the output layer are adjusted 
using Equations (2-5) and (2-6) . The is propagated back 

through the network to generate a value for 6 for each neuron 
in the hidden layer. These values are used to adjust the 
weights of the preceding hidden layer, all the way back to the 
linkweights that act upon the inputs. 

This is most easily shown in vector notation. The 
vector of <5's for the output layer is defined as D M _, and the 
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set of linkweights for the output layer as the matrix A M _,. To 
find D u , 2 , multiply D u _ t by the transpose of the matrix A M _ 2 . 
Then multiply each component of this vector by the first 
derivative of the activation function for the corresponding 
neuron in the hidden layer. This yields the vector of 6's (P M . 
2 ) for the hidden layer. Mathematically, 

D m . 2 = [D^aJL J . * [f'(3“- x ) ] , (2-7) 

where .* denotes component -by- component multiplication of 
arrays [Ref. 3: pp. 51-53] . 

B. INTEGRATION OF NEURAL NETWORKS AND ADAPTIVE CONTROL 

As suggested earlier, the versatility of neural networks 
makes them prime candidates for nonlinear controllers. The 
vast majority of LTI systems are able to be controlled by 
linear quadratic regulators (LQRs) . However, the LQR itself 
may not exhibit satisfactory performance in the presence of 
large uncertainties. That is, inaccuracies in the estimation 
of system parameters or nonlinearities in the system may cause 
the LQR to lose its optimality. In particular, nonlinear 
system dynamics can not be accounted for in LQR design. 

The proposal here is to derive a nonlinear adaptive 
regulator which uses neural networks to compensate for the 
nonlinear dynamics. This regulator will include two 
backpropagating neural networks (BNN's) : one for modeling and 



12 



one for control. The modeling BNN is used to find a more 
accurate state vector estimate than that found using the given 
system parameters. The control BNN modifies the control input 
by adding nonlinear terms to the LQR. 

1. Linear Quadratic Regulator 



a performance index known as the cost function. Consider the 
LTI system: 



where x, - (x 7 ,, x^,, ... , x^) r is the n-dimensional state 
vector, - {u u , u 2jl ... , u BI>/ ) r is the m- dimensional control 
vector, and A and B are the system parameters of dimensions (n 
x n) and (n x m ) , respectively. The LQR is a regulator which 
minimizes the cost function: 



where Q is an (n x n) symmetric and positive semi -definite 
matrix and R is an (m x m) positive definite matrix. The 
optimal control u, is defined by 



The LQR is a state feedback controller which minimizes 



= AX C + BU C , 



( 2 - 8 ) 




( 2 - 9 ) 




( 2 - 10 ) 



where 
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K = [B t PB*R]' x B t PA. 



( 2 - 11 ) 



K is the (in x n) optimal feedback gain matrix and the (n x n) 
matrix P is the solution to the algebraic Riccati equation 
[Ref. 4: p. 320] : 

P = A t PA + Q-A t PB[B t PB + R]- 1 B t PA. (2-12) 

Obviously, the LQR does not take nonlinear system . 
dynamics into consideration, but it does give a good baseline 
for comparison to the nonlinear regulator derived in the next 
section. 

2. Backpropagating Neural Network Design 

The system with the nonlinear regulator is shown in 
Figure 5. The modeling and control BNNs are labeled BNNM and 
BNNC, respectively. The modeling BNN is connected in parallel 
with the estimated plant parameters, labeled A e and B e . The 
control BNN is connected in parallel with the LQR, labeled K. 
The block labeled D denotes a unit time delay. 

The nonlinear plant will be modeled as a linear time- 
invariant system with a nonlinearity added: 

x cn = Ax t . + Bu c + v c (2-13) 

where v t is the n-dimensional nonlinear function. First, a 
linear estimate of the state equation is made: 
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Figure 5. Regulator Block Diagram 

*t*l = A e*t +B e U C (2-14) 

The estimates of the system parameters may be computed by any 
number of methods. This linear estimate will deviate from the 
actual system output due to the nonlinear system dynamics and 
the inaccuracies in parameter estimation. 

In order to compute a more accurate estimate of x, +1 , 
the modeling BNN is used to produce an adjustment to The 

modeling BNN has the (m + n) -dimensional input vector (u,, j^) 
and the n-dimensional output vector 6x, +1 . Therefore, the new 
estimate of x, +l is: 
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where 






( 2 - 16 ) 



The modeling BNN is trained so that the error between x, +1 and 
is minimized. 



and B e found in equation (2-14) and user-defined weighting 
matrices Q and R. The linear control input is computed: 



As discussed earlier, this controller is no longer optimal due 
to the nonlinear system dynamics and the parameter 
inaccuracies. The control BNN is used in order to optimize 
the control input. The control BNN has the n- dimensional 
input vector (jq) and the m - dimensional output vector du, . The 
control input is generated by 



The optimal feedback gain matrix K is computed from A e 




( 2 - 17 ) 



u t = u c + 5u t 



( 2 - 18 ) 



where 



5u t = h{x t ) . 



( 2 - 19 ) 



The control BNN is trained so that the control input u, 
minimizes the cost function of equation (2-9) . 



3. BNN Training Algorithms 

The following notation is used for both the modeling 
and the control BNN: 

• n refers to a particular layer of the network; 

• the ith node in the nth layer is designated node (n, i ) ; 

• AT is the total number of layers in the network, including 
the input and output layers; 

• N n is the number of nodes in the nth layer; 

• x",, is the output of the node (n, i) at time t; 

• a n tJ is the linkweight from the node (n, j) to the node (n + 

1 , 1 ). 

The functional representation of the nodefn, i) is found by: 



or 

z n = a a " 1 x n_1 (2-21) 

where a"' 1 is the ( N„ x N n .,) -dimensional linkweight matrix and 

x?. c = f(zf, c > (2-22) 



j* i 



n-l n - 1 

«i.j x j. t 



( 2 - 20 ) 



where x", f is the ith element of the vector input to layer n. 
The function f (•) is chosen to be the sigmoid function 
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fix) 



1 

l+e~ x 



(2-23) 



for the hidden layers and 

fix) = x (2-24) 

for the output layer. In accord with this notation, the 
dimensions Nj = m + n and N M = n in the modeling BNN, and N l = 
n and N M * m in the control BNN where n and m are the number 
of columns in the system matrices A and B, respectively, 
a. Modeling BNN 

In deriving the modeling BNN, the first step is to 
determine the error function which will be used for training. 
Since the goal is to produce an accurate estimate of x t+l , the 
error function is chosen to minimize the difference between 
x, +I and 

= \ ii t . x -x t . x )H* t ^-x t . x ) . (2-25) 



The linkweights are updated using a version of the gradient 
descent algorithm: 



~ a i,j,c 



dE, M 



c+i 






(2-26) 
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where n M is a step size parameter which controls the learning 
rate of the linkweights. The partial derivative of the cost 
function is computed as follows: 



where 



s ^ r (j£ — jp ) r ( — jp ) i 

- „ ^ n L -5 *t*i> v, *c»x ■*e«a' J 

dai.j 3a/; J 2 



9 (jt -x ) T — 



s (jfc —X ) ^ 



d a i,j 

d *c*i 

tef.J 



(&C*1 X t*l) 



(2-27) 






a 



(x e+ i + 5x e+1 ) . 



(2-28) 



Since x; +i is independent of the linkweights: 



Bit, 



C>1 _ 



3a 3a /V 



Ox 



C+l 



(2-29) 



aai! 



—g(u ef x c ) 



The function g is the mapping function for the layers of the 
network described earlier. 

Therefore, 



ag& 

tef.j 



(x c+1 - ) 









flr(« e * x t ) 



(2-30) 
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The computation of (dg/da) is found next [Ref. 2: 



p.412] : 



-i—g(u t , x c ) » A5. t x/ #t (2-31) 

d *i.j 



where 



w n . 



i, e = f (Zi.t) 2^ a *.i 
*•1 



(2-32) 



The function f'(x) denotes the first derivative of f (x) at x. 
The initial condition on A is 



A T.\ = [0,..., 0,1,0 0] r . 



(2-33) 



Therefore, the linkweight training is accomplished as: 



a i!i. fi - * ji w U c *i-Jr t *i) T Al t x/. t (2-34) 



b. Control BNN 

Finding the error function for the control BNN is 
more involved than for the modeling BNN. If the plant was an 
LTI system, the control input 14 which minimizes 

E c c = (z^Px^ + u^u,) (2-35) 

is given by equations (2-10) through (2-12) [Ref. 2: p.413] . 
This implies that the LQR generates a control 'input which 
minimizes equation (2-35) for each time t. This control input 
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is suboptimal even if the system is LTI. The nonlinearities 
in the system make the LQR even less optimal. The control BNN 
is connected in parallel with the LQR to adjust for control 
errors caused by parameter inaccuracies and nonlinear system 
dynamics . Therefore an error function must be chosen which 
makes the output of the control BNN equal zero if the plant 
parameters are exactly expressed by A e and B e and the 
nonlinearities are set equal to zero. Equation (2-35) is a 
good choice for the error function at time t, but x, +] is not 
available at time t. It is replaced by x, +1 which is found 
with the modeling BNN. The error function for the control BNN 
is now 

E t = (f t r n Pi cn +a?J?u c ) . (2-36) 



The linkweight for the control BNN is defined as 
if tJ in order to differentiate it from the modeling BNN 
linkweight, a n lJ . The control linkweights are updated by the 
gradient descent algorithm: 



bi.j.t * i = ~ Pc 



dE, c 



t + 1 



db i.j 



(2t37) 



where /x c is the learning rate parameter. 
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Equation (2-19) defines the nonlinear function for 



the control BNN. This is used to calculate the training 



algorithm as follows: 








a^e*i _ 


3 


[4- (im^ +uJj?Uj 


tofj = 


toil* 








p 




3u c 




abi'i 


to* i ' 



( 2 - 38 ) 



First, (dXt+j/db) will be derived followed by the derivation of 
(dUf/db) . Substituting equation (2-15) into (9^ +i /3b) gives: 



tof.j 



a 

toilj 



(x c+1 + 5x et i) . 



( 2 - 39 ) 



Next, equations (2-14) and (2-18) are used: 



a* 



t+i _ 



*>i.j db °j 



(A 0 x e + B 0 u e + 6x e . 1 ) 



db? 



— (A <f x c +B a (u c + 6u c ) +#x erl ) 



( 2 - 40 ) 



but X; and u, are independent of b, therefore 

_ a 



dbi.j 



(B e 5u c + 6x c+1 ) . 



( 2 - 41 ) 



Substituting in equations (2-19) and (2-16) results in 



= B ah(x c ) ^ ag(u e , x e ) 

dbf'j e db?^ db"_ j 



( 2 - 42 ) 
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The second partial derivative in equation (2-42) 

is the partial derivative of a function of 14 which is itself 

dependent upon jb, therefore the derivative must be split as 

» 

follows : 



_ B dh(x e ) ^ agr(u c ,x c ) au c 

3b? j ' * dbf.j *>e dbf.j ’ 



(2-43) 



But the partial derivative of 14 may be further simplified by: 



du e _ 8(u c + 6u c ) 

a K, dbfj 

a(flu e ) 

’ 3b?.j 

_ 3A(x e ) 
tei.l ’ 



(2-44) 



Substituting equation (2-44) into equation (2-43) : 



d&t+i m B 3A(x c ) ^ ag(u e> x e ) ah(x e ) 
to/i " 9 3b/ * 8u c dbPj 

. ( a . dg(u t ,x c ) \ dh(x t ). 

"r a “ C J db? tj • 



(2-45) 



Using equation (2-31) , (dh(x,) /db) may be computed in the same 
manner as (dg ( 14,14) /da) . The (n x m) matrix (dg( 14,14) /du,) is 
found by [Ref. 2: p.413] : 



ag(u e/ x e ) 

tec 




(2-46) 
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The control linkweight training algorithm may be 



summarized as: 



bUt.i-br.j. t -vL c 






dg{u c ,x t ) \ Tn 

-dr-h* 



Al.'Xf.c (2-47) 



The training algorithm for both BNNs may be 
summarized as follows: 



• observe the state vector jq at time t, 

• update the linkweight matrices a, using equation (2-34) , 

• compute the control input with equations (2-17) , (2-18) 
and (2-19) , 

• calculate the state vector estimate % +1 using equations 

(2-14), (2-15), and (2-16), 

• update the linkweight matrices Jb, using equation (2-47) . 

This is repeated at each successive time t, as the new state 
vector becomes available. 
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III. SIMULATION RESULTS AND DISCUSSION 



A. NONLINEAR SYSTEM DESIGN 

In order to test the LQR and the nonlinear regulator, a 
nonlinear system was required to be modeled. This system was 
modeled with a linear part which was then acted upon by a 
nonlinear function. The linear portion was artificially 
estimated to provide a more realistic case study. The linear 
system used as a baseline was first presented as a transfer 
function in continuous time, transformed to a state space 
representation, and then converted to discrete time. 

The chosen transfer function was unstable with poles at 
+2.00 and -1.00, a zero at +0.10, and a gain of 10: 

Y(s) _ lO(s-O.l) ,3.x 

U(s) (s + 1) (s- 2 ) ' 



The equivalent of this transfer function in state space format 
is [Ref. 1 : pp. 675 - 677 ]: 





0 1 ' 




10 ' 


jr = 




y + 






2 1 




9 , 



= A c y+B c u. 
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This equation is converted to discrete time using a truncated 
infinite series to calculate the state transition matrix and 
input matrix [Ref. 4: pp. 123-127] with a sample time of 0.1 
seconds. The discrete time equation is: 



y t + x = 



1.0104 0.1055 
0.2110 1.1159 
* Ay t +Bu t . 



y t + 



1.0500 

1.0533 



u. 



(3-3) 



Estimated values ( A e and B e ) were found for the purpose of 
illustrating the relative robustness of the LQR and the 
nonlinear regulator. These values were found by taking the 
members of A and B and calculating values which differed from 
the actual numbers by ±10-20 percent. 

The nonlinear part of the system equation was modeled as 
follows : 

= yi. t+1 +€ (i- ex P(y 2 , t *i) ) (3 _ 4) 

* 2 .fi = y 2 , t *i +€ (i- ex P<yi, t *i>) • 



The value e is a control on the level of nonlinearity. 



B. LQR SIMULATION RESULTS 

The LQR is calculated using A e and B e . The weighting 
matrices Q and R are set at : 



Q = 



0 0 
0 1 



and R = 1 . 



(3-5) 
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The solutions to equations (2-11) and (2-12) are: 



K = [0.1775 0.5057] 



and 



0.0771 0.1060' 
0.1060 1.3859. * 



(3-6) 



Figures 6 and 7 illustrate the performance of the LQR to 
a unit impulse. Figure 6 shows the system response for e = 
0.05. The state variables quickly converge to zero. Figure 
7 shows the response to the same input for e = 0.20. It can 
be seen that the LQR is not sufficiently robust to cause the 
state variables to converge to zero. The plant converges to 
a nonzero equilibrium value. 

C. NONLINEAR REGULATOR SIMULATION RESULTS 

Both BNN's used here are three-layered neural networks 
with 30 nodes in the hidden layer. In the modeling BNN, N } = 
m + n - 3, N 2 - 30, and N 3 - n - 2. In the control BNN, Nj = 
n - 2, N 2 - 30, and N 3 = m - 1. The learning rate for both 
BNN's was set at' 0.05. The initial condition on the 
linkweight matrices was to fill them with small random numbers 
(normal distribution, standard deviation = 0.1). 

Figure 8 plots simulation results for the nonlinear plant 
with e = 0.05. The results are quite similar to those of 
Figure 6. For this level of nonlinearity, the nonlinear 
regulator causes the state variables to converge to zero 
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System Outputs ** System Outputs 




6. LQR Result for € » 0.05 




Figure 7. LQR Result for e 



0.20 



faster than the LQR, but the response is slightly more 
oscillatory. 




Figure 8. Nonlinear Regulator Result for e = 0.05 

Figure 9 shows simulation results for the nonlinear plant 
with e = 0.20. Comparing these results to those of Figure 7 
shows that the nonlinear regulator reduces the state variables 
to zero in a case where the LQR is unable to do so. While the 
LQR stabilizes the system for relatively large factors of e, 
it is of limited utility once the state variables no longer 
converge to zero. The state variables converge to zero using 
the LQR up to e = 0.18 while the nonlinear regulator continues 
to drive the state variables to zero up to e = 0.21. The 
performance of the nonlinear regulator is better than the LQR 
for 6 a 0.06. 
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o a 




Time (sec) 



Figure 9. Nonlinear Regulator Result for e = 0.20 



1. BNN Parameter Variation 

The next area of investigation was to vary some of the 
parameters of the neural network to examine the effects upon 
the regulator. . This was accomplished by holding all 
parameters constant with the exception of the one in question. 
The performance index W is found by summing the absolute 
values of the state variables (x It and x 2t ) for the specified 
time range. The performance index W was calculated for each 
instance, again for a unit impulse over a range of 20 seconds. 

One of the major variances in the performance of 
neural networks is caused by the number of nodes in the hidden 
layer of the network. This number ( N 2 ) was varied from 10 to 
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40. The results are plotted in Figure 10. The straight line 
(without point markers) shows the performance of the LQR with 
no neural network. Figure 10 shows that adding more nodes to 
the hidden layer improves the performance of the regulator. 
However, when e is small, the LQR yields better performance 
than the nonlinear regulator with any number of nodes. As e 
increases above 0.15, the performance degrades and the number 
of nodes has less of an effect. There are limits upon the 
number of nodes used. When the regulator was tried with 50 
nodes in the hidden layer, the system became unstable. 




Figure 10. Variation in W due to Number of Nodes 



The learning rate was also varied' to determine its 
effect upon regulator performance. This step size parameter 



31 



had an effect on W similar to that of varying N 2 . Both the 
modeling and the control BNN's had their learning rate changed 
to the same value. Figure 11 shows the effect of this upon W. 
An interesting feature of Figure 11 is that while increasing 
the learning rate caused better performance, this was only 
true up to n = 0.05. When /i was increased beyond this, 
performance was degraded. Increasing fi to 0.10 caused the 
system to become unstable. 




Figure 11. Variation in W due to Learning Rate 



2 . BNN Training 

One of the advantages of this regulator design is that 
there is very little training required in order to achieve 
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convergence of the linkweight values. All simulation runs 
were completed with a single unit impulse training run. One 
of the major drawbacks to the majority of neural network 
applications is the amount of training inputs required for 
convergence [Ref. 3: p.88] . The apparent reason for this 
rapid convergence is that the BNN's are not providing all of 
the state vector estimate and control input: they are only 
adding a correction to the linear state vector estimate and 
the LQR control input . 
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IV. CONCLUSIONS 



A. SUMMARY 

Starting with the rationale for developing an adaptive 
regulator for nonlinear systems, the development of a 
nonlinear regulator which used backpropagating neural networks 
in conjunction with a linear quadratic regulator in an 
adaptive control system was proposed. The regulator was 
derived and applied to control a representative nonlinear 
system. 

B. SIGNIFICANT RESULTS 

Simulations of the BNN-based nonlinear regulator were 
conducted on a representative nonlinear system. The main 
observations from these simulations are summarized below: 



• The results show that the nonlinear regulator works well 
in the control of a nonlinear system. It was also seen 
that using a LQR on the system works if the nonlinearity 
is small. 

• There are many variables involved in the regulator design 
which must be optimized by trial and error. Definitive 
rules to govern the selection of these variables would 
significantly _ decrease the time to arrive at an 
appropriate controller. 

• The amount of training required for the regulator is 
minimal. One pulse is sufficient to train the network. 

• This regulator is unproven for all nonlinear systems. Its 
utility may be limited to those which may be modeled as a 
linear system with an added nonlinear component. 
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C . FURTHER RESEARCH 



The emphasis in this thesis was to develop a BNN- based 
nonlinear regulator that would use the a priori knowledge of 
system parameters to find a controller that could function 
more efficiently than one which did not utilize this 
knowledge. The neural networks used in this research only 
contained one hidden layer. While this was sufficient, the 
rules used in deriving the BNN' s for the regulator could be 
used to design neural networks with numerous hidden layers. 
Guidelines should be developed which would govern the choice 
of the number of layers used in the neural networks, the 
number of nodes necessary in the hidden layers, the learning 
parameters, and the weighting matrices for the error 
functions. 
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APPENDIX A: NONLINEAR REGULATOR SIMULATION 



A. SYSTEM DERIVATION 

A nonlinear regulator using backpropagating neural 
networks in conjunction with a linear quadratic regulator was 
designed in Section B of Chapter II. This appendix details 
the procedure used to arrive at the simulation results of 
Chapter II and includes the software written for the research. 

The representative nonlinear system chosen for examination 
has the continuous domain transfer function: 

Yjs) m 10 (s-0 . 1) _ lOs-1 

tHs) (s + 1) (s-2) " s 2 -s- 2' 



This equation must be transformed to state space format. This 
is accomplished using the derivation shown below [Ref. 1: 
p. 675-677] . 



Y(s) = s a *Jb 1 s a - 1 »-»j? n - 1 s»Jb n 
U(s) s° + a 1 5 n ‘ 1 +- + a n . 1 s + a n 



This may be rearranged and transformed to an nth- order system 
of linear differential equations: 

y (n) +a 1 y ln ~ 1) +- + a n y = b 0 u (n) + b ± u (n_1) +•••+ b n u (A-3) 
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where y <n> and u (n) are the nth order derivatives of y and u. The 
state variable is chosen: 



*1 = y 
X 1 = *2 
^2 ~ X 3 



(A- 4) 



■ n = _ 



~ a n -\ x 2 a i'Vj 



X + 



b 0 u 



(n) 



+ J b 1 u (a_1) +••• +b n ii. 



However, x t = y may not yield a unique solution due to the 

derivatives of the forcing function. The state variable are 

# 

redefined as 

- y- P 0 u 

x 2 * y-Po^-PiU = *i-Pi u 

X 3 = y-Po ,:} -Pl li -p2 U = *2~p2 U (A ‘ 5) 

= y (n- 1) -p oU (n-D -...-p n . lU = Vl-Pn-lU 

where jS 0 , j 8 y/ ...,j3 n are determined from 

Po - b o 
Pi = ^l“-3lPo 

P 2 = i? 2 - a iPi- a 2 Po (A-6) 

Pn = b n — ^lPn-l ~ a n-lPl ~ a /iPo ■ 

With this choice of state variables, the following state and 
output equations are found: 
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where 



X = A c X + B c u 
y = C c x+D c u 



(A- 7) 







' 0 1 0 -O' 


*2 




0 0 1-0 


: 


, A c = 


: : : : 


x a-l 




0 0 0 -1 


. X n. 




~^n ~^n-l ~^a- 2 -< ^1, 



B c - 



Px 

Pa 

P a -x 
P 



a , 



C c = [1 0 ••• 0], D e P 0 b 0 . 



(A- 8) 
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B. 



SIMULATION SOFTWARE 



% 

% 

% 

% 

% 

% 

% 

% 

% 

% % 
% 

% 



%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 



LQR.M 

This program computes a linear quadratic regulator for the 
stated system and simulates the response to a unit 
impulse. No neural network adjustment is provided. 

This is a stand-alone program. 

LT Kurt Menke, 10 June 1992 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 



clear; clg 
% 



% 

% The first part of this program converts the system 
% from continuous time transfer function to discrete 
% domain state space equations. 

% 

% 

% The numbers for num and den are for the 
% representative system of Chapter III. This is set up 
% for a second order system with a single input. 

% It may be adapted for any order system using the equations 

% in Appendix A. 
num = [0 10 -1] ; 
den = [1 -1 -2] ; 

% 

% 

% This transformation to state space follows the 

% format of Appendix A. 

al = den (1,2) ; 

a2 = den (1,3) ; 

bO = num(l, 1) ; 

bl = num(l, 2) ; 

b2 = num (1,3) ; 

betaO = bO; 

betal = bl - al*beta0; 

beta2 = b2 -al*betal -a2*beta0; 

% 

% 
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% Continuous state space equations 
Ac = [ 0 1; 

-a2 -alj ; 

Be = [betal; beta2] ; 

Cc = [1 0] ; 

Dc = betaO; 

% 

%disp (' Sample time for continuous') 
dt = input ('to discrete conversion? '); 

% 

% 

% Discrete state space equations 
[A, B] = c2d (Ac, Be, dt) ; 

% 

% 

[na,ma] = size (A); 

[nb,mb] = size (B) ; 
rand ( ' uniform' ) 
rand ( ' seed' , 0) 

% 

% 

% Ar, Br, As, and Bs are used to compute the "estimate" of 

% A and B used in the computation of the LQR. 

% This ensures that Ae and Be are within +/- 10-20% 

% of A and B 

Ar =» 0.1 . * rand(na,ma) + 0.1; 

Br = 0.1 .* rand (nb, mb) +0.1; 

% 

% 

% As and Bs are arbitrary signs which make the particular 

% value of Ar/Br either +10-20% or -10-20% 

As = [-1 1; 1 -1] ; 

Bs = [-1; 1]; 

Ar = As .* Ar; 

Br = Bs . * Br; 

% 

% 

% Ae and Be are the estimates of A and B 
Ae = (Ar .* A) + A; 

Be = (Br .* B) + B; 

% 

% 

% Weighting matrices 
Q = [0 0; 0 1] ; 

R = 1; 

% 

% 

% Computes the LQR gain matrix (K) and the solution to the 
% algebraic Riccati equation (P) . 

[K, P] = dlqr (Ae,Be,Q,R) ; 
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% 

% 

% e is the coefficient which controls the amount of 
% nonlinearity seen by the system, 
e = input ( 'Nonlinearity coefficient? '); 

% 

%% Recommended run time: 10 or 20 seconds, 

time = input (' Amount of system run time? '); 

% 

% 

% Initial values 
Nt » time/dt + 1; 

xhat «* zeros (na,Nt) ; 

y = zeros (na, Nt) ; 

x = zeros (na,Nt) ; 

u = [1 zeros (l,Nt-l) ] ; % Impulse input at time 0. 

% 

% 

for t = 2:Nt 

y(:,t) - A*x(:,t-1) +B*u(:,t-1); 

% 

% State vector 

x(l f t) - y(l,t) + e* (l-exp(y (2, t) ) ) ; 

x (2 , t ) * y(2,t) + e* (l-exp(y (1, t) ) ) ; 

% 

% Control input 
u ( : , t ) - -K*x( : , t) ; 

% 

% State vector estimate 
xhat(:,t+l) = Ae*x(:,t) + Be*u(:,t); 
end; 

% 

% 

% Performance calculation 
W1 = 0; 

W2 = 0; 
for t = 1 :Nt 

Wl - W1 + abs (x ( 1 , t ) ) ; 

W2 = W2 + abs(x(2,t)); 
end; 

W = Wl + W2; 

% 

% 

% Plot of state vector response to unit impulse 
t=0:dt:time; 

plot (t,x(l, : ) , t ,x(2, : ) ) ,grid; 
text (0.8,0. 8 ,num2str (W) , 'sc') ; 
title (' Linear Quadratic Regulator'); 
xlabel('Time (sec)'); 
ylabel ( ' System Output ' ) ; 



41 



%%%%%%%%% 



% 



%%%%%%%%%%%%%%%%%%%%% 



% 

% NLREG.M 

% 

% This program computes a nonlinear regulator for the stated 
% system and simulates the response to a unit impulse. The 
% regulator uses two neural networks: BNNC and BNNM. This 

% is a stand-alone program. 

% LT Kurt Menke, 10 June 1992 
% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

% 

% 

clear; clg; 

% 



* 

% The first part of this program converts the system 
% from continuous time transfer function to discrete 
% domain state space equations. 

% 

% 

% The numbers for num and den are for the 
% representative system of Chapter III. This is set up 
% for a second order system with a single input. 

% It may be adapted for any order system using the equations 
% in Appendix A. 
num - 10 * [0 1 - .1] ; 
den = conv( [1 -2] , [1 1] ) ; 

% 

% 

% This transformation to state space follows the 

% format of Appendix A. 

al = den (1,2) ; 

a2 = den (1,3) ; 

bO = num(l, 1) ; 

bl = num(l, 2) ; 

b2 = num (1,3); 

betaO = bO; 

betal = bl - al*beta0; 

beta2 = b2 -al*betal -a2*beta0; 

% 

% 

% Continuous state space equations 
Ac = [ 0 1; 

-a2 -alj ; 

Be = [betal; beta2] ; 

Cc = [1 0] ; 

Dc = betaO; 

% 

% 
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disp (' Sample time for continuous') 

dt = input ( ' to discrete conversion? ' ) ; 

% 

% 

% Discrete state space equations 
[A, B] = c2d(Ac,Bc,dt) ; 

% 

% 

[na,ma] = size (A) ; 

[rib, mb] = size(B); 
rand ( 'uniform' ) 
rand ( ' seed' , 0) 

% 

% 

% Ar, Br, As, and Bs are used to compute the "estimate" of 

% A and B used in the computation of the LQR. 

% This ensures that Ae and Be are within +/- 10-20% 

% of A and B 

Ar = 0.1 . * rand(na,ma) + 0.1; 

Br = 0.1 . * rand (nb, mb) + 0.1; 

% 

% 

% As and Bs are arbitrary signs which make the particular 

% value of Ar/Br either +10-20% or -10-20% 

As - [-1 1; 1 -1] ;• 

Bs - [-1; 1] ; 

Ar = As . * Ar; 

Br = Bs . * Br; 

% 

% 

% Ae and Be are the estimates of A and B 
Ae = (Ar .* A) + A; 

Be =* (Br .* B) + B; 

% 

% 

% Weighting matrices 
Q = (0 0; 0 1]; 

R = l; 

% 

% 

% Computes the LQR gain matrix (K) and the solution to- the 
% algebraic Riccati equation (P) . 

[K, P] = dlqr (Ae,Be,Q,R) ; 

% 

% 

% Learning, rate parameters. LearnM is for BNNM and LearnC 
% is for BNNC. 

LearnM = 0.05; 

LearnC = 0.05; 

% 

% 
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% e is the coefficient which controls the amount of 
% nonlinearity seen by the system, 
e = input (' Nonlinearity coefficient? '); 

% 

% 

% Recommended run time: 10 or 20 seconds, 

time = input (' Amount of system run time? '); 

% 

% 

disp(' Number of nodes in hidden layer') 

Nm = input ('of modeling net (BNNM) ? '); 

% 

% 

disp ( ' Number of nodes in hidden layer') 

Nc = input ('of control net (BNNC) ? '); 

% 

% 

% Initialization of the linkweight matrices 
[al,a2] = netinitm(na, Nm, na+2 ) ; 

[bl,b2] = net ini tm (mb, Nc, nb+1) ; 

% 

% 

% Bias value for neural networks 
Bias - 1; 

% 

% 

% Initial values 
Nt = time/dt + 1; 
x = zeros (na, Nt) ; 
y = zeros (na, Nt) ; 
xbar = zeros (na, Nt+1) ; 
xhat = zeros (na, Nt+1) ; 
xdel = zeros (na, Nt+1) ; 
ex = zeros (na, Nt) ; 

u = [1 zeros (mb, Nt-1) ] ; % Impulse input at time 0 

% 

% 

% Training run 
for t = 2:Nt 

y(:,t) = A*x ( : , t- 1) + B*u ( : , t-1) ; 

% State vector 

x(l,t) = y(l,t) + e* (l-exp(y(2,t) ) ) ; 

x (2 , t) = y(2,t) + e*(l-exp(y(l, t) ) ) ; 

% Error vector 

ex(:,t) = x ( : , t ) - xhat ( : , t ) ; 

% Training of linkweights for BNNM 

[al, a2] = bpm (al , a2 , [x( : , t) ;u ( : , t) ; Bias] , ex ( : , t) , Learnm) 
% Linear control input 
ubar(:,t) = -K*x(:,t); 

% BNNC output 

udel(:,t) = netm( [x( : , t) ; Bias], bl, b2) ; 
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% Control input 

u(:,t) = ubar(:,t) + udel(:,t); 

% Linear state vector estimate 
xbar (:, t+1) = Ae*x(:,t) + Be*u(:,t); 

% BNNM output 

xdel (:, t+1) = netm( [x ( : , t) ; u(:,t); Bias], al, a2) ; 

% State vector estimate 

xhat(:,t+l) = xbar(:,t+l) + xdel(:,t+l); 

% Training of linkweights for BNNC 

[bl,b2] ■ bpc (al,a2,bl,b2,x( : , t) ,u( : , t) ,xhat ( : , t+1) , . 

Bias ,P, Be, R, Learn) ; 

end; 

% 

% 

% Resetting initial values 
x = zeros (na, Nt) ; 
y = zeros (na,Nt) ; 
xbar = zeros (na,Nt+l) ; 
xhat = zeros (na,Nt+l) ; 
xdel = zeros (na,Nt+l) ; 
ex = zeros (na,Nt) ; 
ubar * zeros (mb, Nt) ; 
udel » zeros (mb, Nt) ; 

u » [1 zeros (mb, Nt-1) ] ; % Impulse input at time 0 

% 

% ~ 

%- Simulation run 
for t = 2 :Nt 

y ( : , t) = A*x( : , t-1) + B*u ( : , t- 1) ; 

% State vector 

x(l,t) = y(l,t) + e* (1-exp (y (2, t) ) ) ; 

x(2, t) = y(2,t) + e* (1-exp (y(l, t) )) ; 

% Error vector 

ex(:,t) = x( : , t) -xhat ( : , t) ; 

% Training of linkweights for BNNM 

[al,a2] = bpm(al,a2 , [x( : , t) ;u ( : , t) ;Bias] , ex( : , t) , Learnm) 
% Linear control input 
xobar(:,t) = -K*x(:,t); 

% BNNC output 

udel(:,t) = netm( [x( : , t) ; Bias], bl, b2) ; 

% Control input 

u(:,t) = ubar(:,t) + udel(:,t); 

% Linear state vector estimate 
xbar(:,t+l) = Ae*x(:,t) + Be*u(:,t); 

% BNNM output 

xdel (:, t+1) = netm( [x ( : , t) ; u(:,t); Bias], al, a2) ; 

% State vector estimate 

xhat(:,t+l) = xbar (:, t+1) + xdel (:, t+1 ); 

% Training of linkweights for BNNC 

[bl,b2] = bpc (al, a2 ,bl, b2 ,x( : , t) , u ( : , t) ,xhat ( : , t+1) , . 

Bias ,P, Be, R, Learn) ; 
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end; 

% 

% 

% Performance calculation 
W1 = 0; 

W2 = 0; 
for t = 1 :Nt 

W1 = W1 + abs(x(l,t)); 

W2 = W2 + abs (x(2 , t) ) ; 
end; 

W = W1 + W2; 

% 

% 

% Plot of state vector response to unit impulse 
t=0 :dt : time; 

plot (t,x(l, : ) , t,x(2, : ) ) ,grid; 
text (0 . 8, 0 . 8 , mam2str (W) , ' sc' ) ; 
title ( 'Linear Quadratic Regulator'); 
xlabel('Time (sec)'); 
ylabel ( ' System Output ' ) ; 
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The following programs are the subroutines required to run 
LQR.M and NLREG.M. 



function [al,a2] = netinitm(ny,M,nu) 

% 

% Routine for initializing the neural net with 
% small random numbers. 

% 

% Function call: [al,a2] = netinitm(ny,M, nu) 

% 

% where ny = number of outputs 

% M = number of nodes in the hidden layer 

% nu ■ number of inputs 

% 

% LT Kurt Menke, 10 June 1992 
% 

rand ( ' normal ' ) 
rand ( ' seed' , 0) 
al - 0.1*rand(M,nu) ; 
a2 » 0 . l*rand (ny,M) ; 
return 



function x3 - netm(xl,al,a2) 

% 

% Routine for calculating the output of a neural net. 

% 

% Function call: x3 = net (xl,al,a2) 

% 

% where xl = neural net input vector 

% al = linkweight matrix from input to hidden layer 

% a2 = linkweight matrix from hidden layer to output 

% 

% LT Kurt Menke, 10 June 1992 
% 

x2 = sigmoid (al*xl) ; 

x3 = a2 *x2 ; 

return 
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function y = sigmoid (x) 

% 

% Routine for calculating the sigmoid of a vector input 
% 

% Function call: y = sigmoid (x) 

% 

% where x = vector input 

% y = vector output 

% 

% LT Kurt Menke, 10 June 1992 
% 

y = 1 ./ (l+exp(-x) ) ; 
return 



function y = dsig(x) 

% 

% Routine for calculating the first derivative of the 
% sigmoid of a vector input 

% 

% Function call : y * dsig (x) 

% 

% where x - vector input 

% y =» vector output 

% 

% LT Kurt Menke, 10 June 1992 
% 

temp = exp(-x); 

y = temp./ (1 + 2*temp + temp.*temp); 
return; 
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function [al,a2] = bpm(al , a2 , xl, ex, mu) 

% 

% Routine for updating the linkweight matrices for BNNM. 

% 

% Function call: [al,a2] = bpm(al,a2,xl, ex, mu) 

• % 

% where al = linkweight matrix from input to hidden layer 

% a2 = linkweight matrix from hidden layer to output 

% xl = neural net input vector 

% ex * error vector for linkweight adjustment 

% mu = learning rate for BNNM 

% 

% LT Kurt Menke, 10 June 1992 
% 

z2 = al*xl; 

x2 = sigmoid (z2); 

z3 = a2*x2 ; 

% 

parE_al = -diag(dsig (z2) ) *a2' *ex*xl' ; 
parE_a2 = - ex*x2 ' ; 

% 

al ■ al - mu.*parE_al; 
a2 =• a2 - mu . *parE_a2 ; 
return 



49 



function [bl,b2] = bpc (al, a2 , bl, b2 , x, u, xhat , Bias , P, B, R, mu) 

% 

% Routine for updating the linkweight matrices for BNNC. 

% 

% Function call: [bl,b2] = bpm(al,a2,bl,b2,x,u,xhat, . . . 

% Bias,P,B,R,mu) 

% 

% where al » BNNM linkweight s from input to hidden layer 

% a2 = BNNM linkweight s from hidden layer to output 

% x = neural net input vector 

% u = control input 

% xhat = state vector estimate 

% Bias = bias value for neural net 

% P = matrix solution to algebraic Riccati equation 

% B = system input matrix 

% R = weighting factor 

% mu = learning rate for BNNC 

% 

% LT Kurt Menke, 10 June 1992 
% 

xla = [x; u; Bias] ; 
z2a = al*xla; 
x2a = sigmoid (z2a) ; 
z3a - a2*x2a; 

% 

xlb = [x; Bias] ; 
z2b = bl*xlb; 
x2b = sigmoid (z2b) ; 
z3b = b2*x2b ; 

% 

parh_b2 = x2b' ; 

parhjbl = diag (dsig (z2b) ) *b2 ' *xlb' ; 

% 

della = a2*diag (dsig (z2a) ) ; 
parg_h = della* (sum (al' ))' ; 
parE_c = xhat ' *P* (B+parg_h) + u'*R; 

% 

deb2 = parE_c . *parh_b2 ; 
debl = parE_c . *parh_bl; 

% 

bl = bl - mu . *debl ; 
b2 = b2 - mu.*deb2; 
return 
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